Boosted Learning in Dynamic Bayesian Networks for Multimodal Speaker Detection

Authors

  • ASHUTOSH GARG
  • JAMES M. REHG
Abstract

Bayesian network models provide an attractive framework for multimodal sensor fusion. They combine an intuitive graphical representation with efficient algorithms for inference and learning. However, the unsupervised nature of standard parameter learning algorithms for Bayesian networks can lead to poor performance in classification tasks. We have developed a supervised learning framework for Bayesian networks, which is based on the AdaBoost algorithm of Schapire and Freund. Our framework covers static and dynamic Bayesian networks with both discrete and continuous states. We have tested our framework in the context of a novel multimodal HCI application: a speech-based command and control interface for a Smart Kiosk. We provide experimental evidence for the utility of our boosted learning approach.
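
To make the idea of boosted parameter learning concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest possible static Bayesian network (a naive Bayes model with one binary class node and binary feature nodes), refits its parameters on re-weighted data in each round, and combines the rounds with the standard AdaBoost vote. The function names, the smoothing constant, and the toy features are all hypothetical.

```python
import math

SMOOTH = 1e-3  # assumed smoothing constant, chosen only for illustration


def fit_weighted_naive_bayes(X, y, w):
    """Weighted maximum-likelihood fit of a naive Bayes network
    (class node -> each binary feature node)."""
    n_feat = len(X[0])
    total = sum(w)
    prior, cond = {}, {}
    for c in (0, 1):
        wc = sum(wi for wi, yi in zip(w, y) if yi == c)
        prior[c] = (wc + SMOOTH) / (total + 2 * SMOOTH)
        cond[c] = []
        for j in range(n_feat):
            w1 = sum(wi for wi, xi, yi in zip(w, X, y)
                     if yi == c and xi[j] == 1)
            cond[c].append((w1 + SMOOTH) / (wc + 2 * SMOOTH))
    return prior, cond


def predict_nb(model, x):
    """Return the class (0 or 1) with the larger log-posterior."""
    prior, cond = model
    scores = {}
    for c in (0, 1):
        s = math.log(prior[c])
        for j, xj in enumerate(x):
            p = cond[c][j]
            s += math.log(p if xj == 1 else 1.0 - p)
        scores[c] = s
    return 1 if scores[1] > scores[0] else 0


def adaboost(X, y, rounds=5):
    """Discrete AdaBoost with the weighted naive Bayes fit as weak learner."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = fit_weighted_naive_bayes(X, y, w)
        preds = [predict_nb(model, x) for x in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        if err >= 0.5:  # weak learner no better than chance: stop
            break
        err = max(err, 1e-10)  # avoid division by zero on a perfect fit
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, model))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(alpha if p != yi else -alpha)
             for wi, p, yi in zip(w, preds, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the boosted models, mapping classes to -1/+1."""
    score = sum(alpha * (1 if predict_nb(m, x) == 1 else -1)
                for alpha, m in ensemble)
    return 1 if score > 0 else 0
```

A hypothetical usage: with rows of `X` holding two binary cues (say, an audio-activity flag and a lip-motion flag) and `y` the speaking/not-speaking label, `adaboost(X, y, rounds=3)` returns the weighted ensemble and `predict(ensemble, x)` classifies a new observation. Extending this to dynamic networks or continuous states, as the abstract describes, would require a different weak learner and is not attempted here.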

Related resources

Boosting and Structure Learning in Dynamic Bayesian Networks for Audio-Visual Speaker Detection

Bayesian networks are an attractive modeling tool for human sensing, as they combine an intuitive graphical representation with efficient algorithms for inference and learning. Earlier work has demonstrated that boosted parameter learning could be used to improve the performance of Bayesian network classifiers for complex multi-modal inference problems such as speaker detection. In speaker detect...

Multimodal Speaker Detection using Error Feedback Dynamic Bayesian Networks

Design and development of novel human-computer interfaces poses a challenging problem: actions and intentions of users have to be inferred from sequences of noisy and ambiguous multi-sensory data such as video and sound. Temporal fusion of multiple sensors has been efficiently formulated using dynamic Bayesian networks (DBNs) which allow the power of statistical inference and learning to be com...

Multimodal Speaker Detection Using Input/Output Dynamic Bayesian Networks

Inferring users’ actions and intentions forms an integral part of design and development of any human-computer interface. The presence of noisy and at times ambiguous sensory data makes this problem challenging. We formulate a framework for temporal fusion of multiple sensors using input–output dynamic Bayesian networks (IODBNs). We find that contextual information about the state of the comput...

Floor holder detection and end of speaker turn prediction in meetings

We propose a novel fully automatic framework to detect which meeting participant is currently holding the conversational floor and when the current speaker turn is going to finish. Two sets of experiments were conducted on a large collection of multiparty conversations: the AMI meeting corpus. Unsupervised speaker turn detection was performed by post-processing the speaker diarization and the s...

Publication date: 2001